dtSearch Text Retrieval Engine Programmer's Reference 7.70
IOptions::UnicodeFilterWordOverlapAmount Property

Number of characters by which consecutive words overlap when the Unicode Filtering algorithm automatically breaks a long run of letters into words.

__property long UnicodeFilterWordOverlapAmount;
Description

 

Unicode Filtering can automatically break a long run of letters into words whenever it finds more than Options.MaxWordLength consecutive letters. By default, a word break is inserted and the next word starts with the following character, so the resulting words do not overlap. To start the next word before the end of the previous word, set UnicodeFilterWordOverlapAmount to the number of characters of overlap and also set the dtsoUfAutoWordBreakOverlapWords flag in UnicodeFilterFlags.

For example, suppose the maximum word length is set to 8 and the following 16-letter run is found: aaaaahiddenaaaaa. By default, this would be indexed as aaaaahid and denaaaaa. The word "hidden" is split across the two words, so a search for *hidden* would not find it. With a word overlap of 4, each new word starts 4 characters before the end of the previous one, and the run would be indexed as aaaaahid, ahiddena, and denaaaaa. The second word contains "hidden" intact, so a search for *hidden* would find it.
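The splitting in the example above can be reproduced with a short C++ sketch. This is an illustration only, not the dtSearch implementation: BreakRunWithOverlap is a hypothetical helper that is not part of the dtSearch API, it treats each byte as one letter (so it is only meaningful for ASCII input), and it assumes the overlap is smaller than the maximum word length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of the dtSearch API.
// Splits a run of letters into words of at most maxWordLength characters.
// Each word after the first starts `overlap` characters before the end of
// the previous word. Assumes one byte per letter (ASCII input).
std::vector<std::string> BreakRunWithOverlap(const std::string& run,
                                             std::size_t maxWordLength,
                                             std::size_t overlap) {
    // The overlap must be smaller than the word length, or the start
    // position would never advance.
    assert(maxWordLength > 0 && overlap < maxWordLength);

    std::vector<std::string> words;
    const std::size_t step = maxWordLength - overlap;
    std::size_t start = 0;
    while (start < run.size()) {
        // substr clamps the length at the end of the string.
        words.push_back(run.substr(start, maxWordLength));
        // Stop once the current word reaches the end of the run.
        if (start + maxWordLength >= run.size()) break;
        start += step;
    }
    return words;
}
```

Calling BreakRunWithOverlap("aaaaahiddenaaaaa", 8, 0) in this sketch yields aaaaahid and denaaaaa, while an overlap of 4 yields aaaaahid, ahiddena, and denaaaaa, matching the example. How the actual engine handles the end of a run, or an overlap that is not smaller than the maximum word length, is not specified here.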

Copyright (c) 1995-2012 dtSearch Corp. All rights reserved.